Search for: All records

Creators/Authors contains: "Carstens, Bryan C"

  1. Orti, Guillermo (Ed.)
    While genetic variation in any species is potentially shaped by a range of processes, phylogeography and landscape genetics are largely concerned with inferring how environmental conditions and landscape features impact neutral intraspecific diversity. However, even as both disciplines have come to utilize SNP data over the last decades, analytical approaches have remained for the most part focused on either broad-scale inferences of historical processes (phylogeography) or on more localized inferences about environmental and/or landscape features (landscape genetics). Here we demonstrate that an artificial intelligence model-based analytical framework can consider both deeper historical factors and landscape-level processes in an integrated analysis. We implement this framework using data collected from two Brazilian anurans, the Brazilian sibilator frog (Leptodactylus troglodytes) and granular toad (Rhinella granulosa). Our results indicate that historical demographic processes shape most of the genetic variation in the sibilator frog, while landscape processes primarily influence variation in the granular toad. The machine learning framework used here allows both historical and landscape processes to be considered equally, rather than requiring researchers to make an a priori decision about which factors are important.
  2. Nice, Christopher (Ed.)
    The geographic distribution of genetic variation within a species reveals information about its evolutionary history, including responses to historical climate change and dispersal ability across various habitat types. We combine genetic data from salamander species with geographic, climatic, and life history data collected from open-source online repositories to develop a machine learning model designed to identify the traits that are most predictive of unrecognized genetic lineages. We find evidence of hidden diversity distributed throughout the clade Caudata that is largely the result of variation in climatic variables. We highlight some of the difficulties in using machine-learning models on open-source data that are often messy and potentially taxonomically and geographically biased. 
  3. Global climatic fluctuation has significantly impacted biodiversity by shaping adaptations across numerous species. Pleistocene climate changes notably affected species’ geographic distributions and population sizes, especially fostering post-glacial expansions in temperate regions. Evolutionary theory suggests spatial sorting of morphological traits associated with dispersal in recently expanded species. However, evidence of predicted intraspecific trait variation is scant. We investigated intraspecific trait variation in five lizard species along a forest-savanna gradient affected by Pleistocene climate. Lizards serve as an ideal group to test these ideas due to climate’s known influence on their morphological traits linked to essential functions like feeding and locomotion. We assessed two hypotheses: (i) niche variation and (ii) spatial sorting. For the niche variation hypothesis, we predicted increased intraspecific variability in head dimensions with distance from stable areas. For spatial sorting, we anticipated larger hind limb sizes with increased distance from stable areas. We gathered data on five quantitative traits from 663 samples across species. There was no evidence supporting either hypothesis across the five species. Limited sample sizes, challenges in habitat modeling, or other factors might explain this lack of support. Nonetheless, our study illuminates complexities in exploring trait variation within species. The data collected here, although inconclusive, represent a crucial test for evolutionary theory. 
  4. Staples, Anne Elizabeth (Ed.)
    Vocalizations in animals, particularly birds, are critically important behaviors that influence their reproductive fitness. While recordings of bioacoustic data have been captured and stored in collections for decades, the automated extraction of data from these recordings has only recently been facilitated by artificial intelligence methods. These have yet to be evaluated with respect to accuracy of different automation strategies and features. Here, we use a recently published machine learning framework to extract syllables from ten bird species ranging in their phylogenetic relatedness from 1 to 85 million years, to compare how phylogenetic relatedness influences accuracy. We also evaluate the utility of applying trained models to novel species. Our results indicate that model performance is best on conspecifics, with accuracy progressively decreasing as phylogenetic distance increases between taxa. However, we also find that the application of models trained on multiple distantly related species can improve the overall accuracy to levels near that of training and analyzing a model on the same species. When planning big-data bioacoustics studies, care must be taken in sample design to maximize sample size and minimize human labor without sacrificing accuracy. 
  5. Ruane, Sara (Ed.)
    Comparisons of intraspecific genetic diversity across species can reveal the roles of geography, ecology, and life history in shaping biodiversity. The wide availability of mitochondrial DNA (mtDNA) sequences in open-access databases makes this marker practical for conducting analyses across several species in a common framework, but patterns may not be representative of overall species diversity. Here, we gather new and existing mtDNA sequences and genome-wide nuclear data (genotyping-by-sequencing; GBS) for 30 North American squamate species sampled in the Southeastern and Southwestern United States. We estimated mtDNA nucleotide diversity for 2 mtDNA genes, COI (22 species alignments; average 16 sequences) and cytb (22 species; average 58 sequences), as well as nuclear heterozygosity and nucleotide diversity from GBS data for 118 individuals (30 species; 4 individuals and 6,820 to 44,309 loci per species). We showed that nuclear genomic diversity estimates were highly consistent across individuals for some species, while other species showed large differences depending on the locality sampled. Range size was positively correlated with both cytb diversity (phylogenetically independent contrasts: R² = 0.31, P = 0.007) and GBS diversity (R² = 0.21; P = 0.006), while other predictors differed across the top models for each dataset. Mitochondrial and nuclear diversity estimates were not correlated within species, although sampling differences in the data available made these datasets difficult to compare. Further study of mtDNA and nuclear diversity sampled across species’ ranges is needed to evaluate the roles of geography and life history in structuring diversity across a variety of taxonomic groups.
  6. Intraspecific genetic diversity is a key aspect of biodiversity. Quaternary climatic change and glaciation influenced intraspecific genetic diversity by promoting range shifts and population size change. However, the extent to which glaciation affected genetic diversity on a global scale is not well established. Here we quantify nucleotide diversity, a common metric of intraspecific genetic diversity, in more than 38,000 plant and animal species using georeferenced DNA sequences from millions of samples. Results demonstrate that tropical species contain significantly more intraspecific genetic diversity than nontropical species. To explore potential evolutionary processes that may have contributed to this pattern, we calculated summary statistics that measure population demographic change and detected significant correlations between these statistics and latitude. We find that nontropical species are more likely to deviate from neutral expectations, indicating that they have historically experienced dramatic fluctuations in population size likely associated with Pleistocene glacial cycles. By analyzing the most comprehensive data set to date, our results imply that Quaternary climate perturbations may be more important as a process driving the latitudinal gradient in species richness than previously appreciated.
  7. Research in the biological sciences is hampered by the Linnean shortfall, which describes the number of hidden species that are suspected of existing without formal species description. Using machine learning and species delimitation methods, we built a predictive model that incorporates some 5.0 × 10⁵ data points for 117 species traits, 3.3 × 10⁶ occurrence records, and 9.1 × 10⁵ gene sequences from 4,310 recognized species of mammals. Delimitation results suggest that there are hundreds of undescribed species in class Mammalia. Predictive modeling indicates that most of these hidden species will be found in small-bodied taxa with large ranges characterized by high variability in temperature and precipitation. As demonstrated by a quantitative analysis of the literature, such taxa have long been the focus of taxonomic research. This analysis supports taxonomic hypotheses regarding where undescribed diversity is likely to be found and highlights the need for investment in taxonomic research to overcome the Linnean shortfall.
  8. Fountain-Jones, Nicholas M; Smith, Megan L; Austerlitz, Frédéric (Ed.)
    The discipline of phylogeography has evolved rapidly in terms of the analytical toolkit used to analyse large genomic data sets. Despite substantial advances, analytical tools that could potentially address the challenges posed by increased model complexity have not been fully explored. For example, deep learning techniques are underutilized for phylogeographic model selection. In non‐model organisms, the lack of information about their ecology and evolution can lead to uncertainty about which demographic models are appropriate. Here, we assess the utility of convolutional neural networks (CNNs) for assessing demographic models in South American lizards in the genus Norops. Three demographic scenarios (constant, expansion, and bottleneck) were considered for each of four inferred population‐level lineages, and we found that the overall model accuracy was higher than 98% for all lineages. We then evaluated a set of 26 models that accounted for evolutionary relationships, gene flow, and changes in effective population size among the four lineages, identifying a single model with an estimated overall accuracy of 87% when using CNNs. The inferred demography of the lizard system suggests that gene flow between non‐sister populations and changes in effective population sizes through time, probably in response to Pleistocene climatic oscillations, have shaped genetic diversity in this system. Approximate Bayesian computation (ABC) was applied to provide a comparison to the performance of CNNs. ABC was unable to identify a single model among the larger set of 26 models in the subsequent analysis. Our results demonstrate that CNNs can be easily and usefully incorporated into the phylogeographer's toolkit.
  9. Phylogenetic estimation under the multispecies coalescent model (MSCM) assumes all incongruence among loci is caused by incomplete lineage sorting. Therefore, applying the MSCM to datasets that contain incongruence that is caused by other processes, such as gene flow, can lead to biased phylogeny estimates. To identify possible bias when using the MSCM, we present P2C2M.SNAPP. P2C2M.SNAPP is an R package that identifies model violations using posterior predictive simulation. P2C2M.SNAPP uses the posterior distribution of species trees output by the software package SNAPP to simulate posterior predictive datasets under the MSCM, and then uses summary statistics to compare either the empirical data or the posterior distribution to the posterior predictive distribution to identify model violations. In simulation testing, P2C2M.SNAPP correctly classified up to 83% of datasets (depending on the summary statistic used) as to whether or not they violated the MSCM model. P2C2M.SNAPP represents a user-friendly way for researchers to perform posterior predictive model checks when using the popular SNAPP phylogenetic estimation program. It is freely available as an R package, along with additional program details and tutorials. 
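
The posterior predictive check described in item 9 can be illustrated with a minimal sketch. This is not the P2C2M.SNAPP interface (that package is written in R, and its functions are not shown in the abstract); the function names, the two-tailed p-value convention, and the 0.05 threshold below are all assumptions chosen for illustration. The idea is only that a summary statistic computed on the empirical data is compared with the same statistic computed on datasets simulated under the model, and an empirical value far in the tails of the simulated distribution is taken as a sign of model violation.

```python
def posterior_predictive_p(empirical_stat, simulated_stats):
    """Two-tailed posterior predictive p-value (hypothetical helper).

    empirical_stat: summary statistic computed on the observed data.
    simulated_stats: the same statistic computed on each dataset
        simulated under the model (one value per simulated dataset).
    """
    n = len(simulated_stats)
    if n == 0:
        raise ValueError("need at least one simulated statistic")
    # Fraction of simulated values at or below / at or above the observed one.
    lower = sum(s <= empirical_stat for s in simulated_stats) / n
    upper = sum(s >= empirical_stat for s in simulated_stats) / n
    # Double the smaller tail and cap at 1 (one common convention, assumed here).
    return min(1.0, 2 * min(lower, upper))


def violates_model(empirical_stat, simulated_stats, alpha=0.05):
    """Flag a model violation when the observed statistic is extreme.

    alpha is an assumed threshold, not a value taken from the abstract.
    """
    return posterior_predictive_p(empirical_stat, simulated_stats) < alpha


if __name__ == "__main__":
    # Made-up numbers standing in for simulated summary statistics.
    simulated = [float(x) for x in range(1, 101)]
    print(violates_model(50.0, simulated))    # observed value is typical
    print(violates_model(1000.0, simulated))  # observed value is extreme
```

In practice the choice of summary statistic matters: the abstract notes that classification accuracy in simulation testing depended on the statistic used.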